Using LSI to evaluate the quality of hypertext links

Authors

  • James Blustein
  • Robert E. Webber
Abstract

Useful hypertext is constrained by the need for users to be able to find documents about similar topics without extensive navigation. We show how examining the properties of a graph built by a document's hypertext links can be used to evaluate the usefulness of the document. To formally measure the quality of hypertext linking in a corpus, we compare the semantic similarity of pairs of documents with the minimum number of links between their corresponding nodes in an analogous hypertext graph. We use the measure of document-to-document similarity computed using latent semantic indexing as our measure of semantic similarity. Our method has been applied to a corpus composed of Usenet messages.
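The comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy term-document matrix, the link graph, the choice of rank `k`, the treatment of links as undirected, and the use of a Pearson correlation as the summary statistic are all assumptions made for illustration, and the function names are hypothetical.

```python
from collections import deque

import numpy as np


def lsi_similarity(term_doc, k):
    """Cosine similarity between documents in a rank-k LSI space.

    term_doc has one row per term and one column per document.
    """
    _, s, vt = np.linalg.svd(term_doc, full_matrices=False)
    doc_vecs = (s[:k, None] * vt[:k]).T  # one row per document
    norms = np.linalg.norm(doc_vecs, axis=1, keepdims=True)
    unit = doc_vecs / np.where(norms == 0, 1.0, norms)
    return unit @ unit.T


def link_distances(adjacency):
    """Minimum number of links between every pair of nodes, via BFS."""
    n = len(adjacency)
    dist = np.full((n, n), np.inf)
    for start in range(n):
        dist[start, start] = 0
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for nxt in adjacency[node]:
                if dist[start, nxt] == np.inf:
                    dist[start, nxt] = dist[start, node] + 1
                    queue.append(nxt)
    return dist


def similarity_distance_correlation(sim, dist):
    """Pearson correlation over document pairs joined by some path.

    Assumption: a well-linked corpus gives a negative value, because
    similar documents should be few links apart.
    """
    i, j = np.triu_indices(sim.shape[0], k=1)
    reachable = np.isfinite(dist[i, j])
    return float(np.corrcoef(sim[i, j][reachable], dist[i, j][reachable])[0, 1])


# Toy corpus (assumed): documents 0 and 1 share terms, as do 2 and 3.
term_doc = np.array([
    [2.0, 1.0, 0.0, 0.0],
    [1.0, 2.0, 0.0, 0.0],
    [0.0, 0.0, 2.0, 1.0],
    [0.0, 0.0, 1.0, 2.0],
])
# Toy hypertext (assumed): an undirected chain of links 0 - 1 - 2 - 3.
adjacency = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}

sim = lsi_similarity(term_doc, k=2)
dist = link_distances(adjacency)
r = similarity_distance_correlation(sim, dist)
print(f"correlation between LSI similarity and link distance: {r:.3f}")
```

In this toy corpus the two similar pairs are directly linked, so the correlation comes out negative. A real corpus would need a proper term-weighting scheme and a considered choice of `k`, neither of which is specified in the abstract.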

Similar resources

CLUSTERING HYPERTEXT WITH APPLICATIONS TO WEB SEARCHING

This report has been submitted for publication outside of IBM and will probably be copyrighted if accepted for publication. It has been issued as a Research Report for early dissemination of its contents. In view of the transfer of copyright to the outside publisher, its distribution outside of IBM prior to publication should be limited to peer communications and specific requests. After outside...

Rule-Based Machine Learning of Hypertext Links

Voluminous documentations could be more advantageously used if they were represented as hypertexts because they are usually not read sequentially but rather browsed for those text passages that contribute to the solution of a certain problem. But since those documents are available very often in linear form only, methods to convert linear documents into hypertexts automatically are desirable. T...

Evaluation of corrosion and scaling potential of drinking water supply sources of Marivan villages, Iran

Corrosion and scaling in drinking water sources can lead to economic and health damages. These processes produce by-products in distribution systems, reduce chemical water quality, and are the cause of health issues among consumers. The aim of this study was to evaluate the corrosion and scaling potential of water supply sources of Marivan villages, Iran. In total, 106 water samples were collec...

Automatic Hypertext Link Typing

We present entirely automatic methods for gathering documents for a hypertext, linking the set, and annotating those connections with a description of the type (i.e., nature) of the link. Document linking is based upon high-quality information retrieval techniques developed using the Smart system. We apply an approach inspired by relationship visualization techniques and by graph simplificatio...

Automatically generating hypertext by computing semantic similarity

We describe a novel method for automatically generating hypertext links within and between newspaper articles. The method is based on lexical chaining, a technique for extracting the sets of related words that occur in texts. Links between the paragraphs of a single article are built by considering the distribution of the lexical chains in that article. Links between articles are built by consi...

Publication date: 1995